Monday, October 20, 2014

Introduction

  • Why does causality matter for policy making?
  • Why do we like experiments?
  • Why do we hate experiments?
  • Turning a weakness into a strength
  • What does policy analysis look like in the real world?

Causality Matters

  • Correlation is not causation
  • But politics has a way of trying to make it seem so

But it's not everything

  • Decisions have to be made with best evidence available
  • Ruling out alternative explanations generally comes with increased costs
  • Scientific process downplays the subjective judgments of domain experts in favor of more expensive objective measurement

Experiments, Yay!

Experiments Do

  • Provide solid internal validity when done well
  • Provide idealized estimates of impact
  • Often test interventions in a best-case scenario
  • Provide clean data with transparent analyses to draw conclusions

Experiments, Boo!

Experiments Do Not

  • Have guaranteed external validity. Does the lab resemble the real world?
  • Measure the variability in implementation that will occur in practice – the importance of precise implementation is often ignored
  • Help us understand why program attrition might occur
  • Easily test the variability in response to treatment under different conditions – recruiting and retaining subjects is $$

Experiments in Policy

  • A randomized controlled trial (RCT) in policy is rarely possible and rarely desirable
  • RCTs require a level of control and authority that regulators rarely possess
  • RCTs require a level of monitoring and tracking that may not be appropriate for a governmental entity
  • Recruiting and retaining subjects is expensive and can substantially add to the bill of a policy
  • Lack of external validity makes it hard to say "Program X worked in the trial, let's go to scale."

An Aside on the Policy Process

John Conway's Game of Life

  • Simple rules in large systems create emergent properties that are complex and unpredictable
  • A simple example is John Conway's Game of Life, played on a two-dimensional grid
  • Any live cell with fewer than two live neighbors dies
  • Any live cell with two or three live neighbors lives
  • Any live cell with more than three live neighbors dies
  • Any dead cell with exactly three live neighbors becomes a live cell
  • These simple rules result in emergent patterns that stabilize in unpredictable but orderly ways
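The four rules above can be sketched in a few lines of code. This is an illustrative implementation, not part of the original slides; it assumes the grid is represented as a set of (row, col) coordinates of live cells.

```python
from collections import Counter

def step(live):
    """Apply one generation of Conway's rules to a set of live cells."""
    # Count live neighbors for every cell adjacent to at least one live cell
    counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    return {
        cell
        for cell, n in counts.items()
        # Dead cell with exactly three live neighbors is born;
        # live cell with two or three live neighbors survives.
        if n == 3 or (n == 2 and cell in live)
    }

# A "blinker": three live cells in a row oscillate between a horizontal
# and a vertical bar, returning to the start every two generations.
blinker = {(1, 0), (1, 1), (1, 2)}
print(step(blinker))                      # the vertical phase
print(step(step(blinker)) == blinker)     # True: period-two oscillation
```

Even this tiny pattern shows the point of the slide: the rules are trivial, but which patterns stabilize, oscillate, or die out is hard to predict without running the system.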

Results

Policy is Not Unlike This

  • Start with simple rules
  • Add more simple rules
  • Regulated entities react to these rules
  • Observe emergent properties
  • Repeat

A Thought Experiment

  • Consider education policy for a moment
  • An experiment done in 10 school districts statewide shows promising test score improvements for 5th grade students in reading with a sample of almost 4,000 students
  • The intervention is a 6 week reading program focused on badgers and balloons
  • The program randomly assigned students to receive the treatment, or to receive their regular classroom instruction
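The design above can be sketched as a simulation. All numbers here (score scale, effect size, assignment probability) are illustrative assumptions, not results from the actual study; the point is that randomization makes the simple difference in means an unbiased estimate of the program's effect.

```python
import random
import statistics

random.seed(42)

n = 4000              # roughly the sample size in the example
true_effect = 3.0     # assumed gain in scale-score points (hypothetical)

students = []
for _ in range(n):
    baseline = random.gauss(200, 15)     # assumed test-score distribution
    treated = random.random() < 0.5      # coin-flip assignment to the program
    score = baseline + (true_effect if treated else 0.0)
    students.append((treated, score))

treat = [s for t, s in students if t]
control = [s for t, s in students if not t]

# Because assignment was random, the groups differ only by chance and by
# the treatment itself, so the difference in means estimates the effect.
estimate = statistics.mean(treat) - statistics.mean(control)
print(round(estimate, 1))
```

With 4,000 students the estimate lands close to the true effect; internal validity is not the problem here, which is exactly why the next question is about external validity.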

What are some threats to external validity here?

  • What is the control group?
  • How were the schools selected to participate?

Challenge

Even if we get the experiment perfect, the evidence it generates is not sufficient to guarantee success at scale.

Non-experimental methods impose a lower cost and lower risk, while providing evidence useful in discussions about how to scale.

How to spot the opportunity?

  • Pilots and phase-ins
  • Budget cuts and phase-outs
  • Policy resistance
  • Policy shifts due to elections (changing priorities)
  • Philanthropy
  • Local innovation
  • Natural experiments (boundary changes, closures, etc.)

Design vs. Serendipity

  • Sometimes we have the privilege of designing a quasi-experimental evaluation into a public policy
  • Most of the time we have the joy of trying to find a quasi-experimental hook to retrospectively evaluate an existing program
  • Designed evaluation is preferable, but the sell can be hard
  • Retrospective evaluation may not always be possible, or only the least rigorous methods may apply

Why? Linked Administrative Systems

Why? Longitudinal Records

  • If we record everything, we can use instant replay to evaluate policy changes retrospectively


Why? Big Data

Quasi-Experiment Examples in the Wild

  • Regression discontinuity through ELL reclassification
  • Fixed effects regression through BLBC program offer
  • Propensity score matching for pre-college scholarship programs
  • Differences in differences evaluation of community learning centers (CLCs)
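To make the last design concrete: a two-group, two-period difference-in-differences estimate subtracts the comparison group's change from the treated group's change, netting out any trend common to both. The numbers below are made up for illustration and are not from the CLC evaluation.

```python
def did(treat_pre, treat_post, comp_pre, comp_post):
    """Two-group, two-period difference-in-differences estimate."""
    return (treat_post - treat_pre) - (comp_post - comp_pre)

# Hypothetical mean outcomes before and after the program
effect = did(treat_pre=50.0, treat_post=58.0,   # CLC schools improved by 8
             comp_pre=49.0, comp_post=52.0)     # comparison improved by 3
print(effect)  # 5.0: estimated impact net of the shared trend
```

The identifying assumption – that both groups would have followed parallel trends absent the program – is exactly the kind of claim content expertise has to defend.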

Caveats

  • Large datasets can work against you: large sample sizes lend false precision to estimates
  • Large datasets can make robust error-estimation techniques – such as Bayesian estimation, simulation, and model visualization – computationally trickier
  • All the caveats about selection on unobservables apply. Content expertise will help you evaluate how credible a validity threat this poses.

Come work for DPI

  • DPI is hiring a part-time research analyst from November/December through the spring
  • Flexible hours (10-30 hours a week, depending on availability)
  • Work on a variety of data analyses in support of DPI budget, legislative, and administrative initiatives
  • To apply, send me a short resume and cover letter jared.knowles@dpi.wi.gov